Paging Memory Model
Paging splits physical memory into equal-sized blocks
- Process address space spread across as many pages as necessary
- Very similar to how files are stored across blocks on a disk
Process memory addresses will appear to be one contiguous range
- Logical addresses numbered from 0 to N (the process limit)
- In reality they will be spread over multiple pages
- The pages could be anywhere in memory (do not need to be consecutive)
- The physical memory used by a page is called the page frame
Each process has a page table that maps logical pages onto physical memory frames
- Needs to be stored in memory as part of the process image
- Address of table stored in page table base register within the CPU
Page Mapping
Frames can be anywhere in the real physical memory
But the virtual pages look like a contiguous address space
![[Pasted image 20250519051546.png]]
Page Table Lookups
![[Pasted image 20250519051759.png]]
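The lookup works like this: split the logical address into a page number and an offset, use the page table to find the frame, then keep the same offset within the frame. A minimal sketch, assuming a single-level page table held as a Python list and 4 KB pages (the table contents and names are illustrative, not real hardware behaviour):

```python
PAGE_SIZE = 4096  # assume 4 KB pages, as used later in these notes

# Illustrative page table: index = logical page number, value = physical frame number
page_table = [7, 2, 9, 4]  # logical pages 0-3 live in frames 7, 2, 9 and 4

def translate(logical_addr):
    """Translate a logical address to a physical address via the page table."""
    page_number = logical_addr // PAGE_SIZE   # which logical page?
    offset = logical_addr % PAGE_SIZE         # position within that page
    frame_number = page_table[page_number]    # frame holding that page
    return frame_number * PAGE_SIZE + offset  # same offset within the frame

# Example: logical address 4100 = page 1, offset 4 -> frame 2, physical 8196
print(translate(4100))
```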
Segmentation and Paging
Segmentation...
- Logical division of address space
- Segments can be different sizes
- Segments match the structure of the program
Paging...
- Physical division of address space
- Each page is the same size
- Pages have no relation to program structure
Advantages of Paging
- Memory allocation is much easier with fixed size units (pages)
- External fragmentation is eliminated (though some space is wasted in the final page of each process, i.e. internal fragmentation)
Segmentation with Paging
The Intel x86 MMU supports segmentation with paging
- Process memory is split into logical segments
- Segments are stored inside pages
- Segmentation concepts still apply but with a paged memory layout
CPU sends logical address to Segmentation Unit
Segmentation Unit sends virtual address to Paging Unit
Paging Unit sends physical address to main memory
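A rough sketch of that two-stage flow, assuming a tiny segment table (a base address per segment) feeding the same kind of page table as earlier; the sizes, table contents and function names are illustrative and not the real x86 descriptor format:

```python
PAGE_SIZE = 4096

# Illustrative segment table: segment number -> base virtual (linear) address
segment_bases = {0: 0x0000, 1: 0x8000}

# Illustrative page table: virtual page number -> physical frame number
page_table = {0: 5, 1: 3, 8: 12, 9: 6}

def segmentation_unit(segment, seg_offset):
    """Logical (segment:offset) address -> virtual (linear) address."""
    return segment_bases[segment] + seg_offset

def paging_unit(virtual_addr):
    """Virtual address -> physical address via the page table."""
    page = virtual_addr // PAGE_SIZE
    offset = virtual_addr % PAGE_SIZE
    return page_table[page] * PAGE_SIZE + offset

# Segment 1, offset 100: virtual 0x8000 + 100 lands in page 8 -> frame 12
print(paging_unit(segmentation_unit(1, 100)))
```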
Page Swapping
The OS can allocate more memory than is physically installed in the system
- Only pages that are used by the current process need to be in main memory
- Other pages can be stored on disk (in swap space)
- Allows the system to have as many processes as it likes in the round-robin queue
- The OS doesn't necessarily know which pages will be needed at any time
If a process tries to access (e.g. jump, read, write) a page that's not in main memory
- A page fault occurs
- This is not an error
Every page fault generates an interrupt
- Interrupt handler catches the page fault
- Correct page is loaded (or swapped) from disk into main memory
- Known as demand paging (pages brought in when needed)
- Some systems use anticipatory paging (bringing in more pages than needed)
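A minimal sketch of demand paging, using a set of resident pages and a hypothetical `load_from_disk()` standing in for the real swap-in (all names are illustrative; eviction is ignored here and covered in the next section):

```python
resident_pages = set()   # pages currently in main memory
page_faults = 0

def load_from_disk(page):
    """Stand-in for swapping a page in from disk (the slow part)."""
    resident_pages.add(page)

def access(page):
    """Touch a page; fault and load it on demand if it is not resident."""
    global page_faults
    if page not in resident_pages:
        page_faults += 1       # page fault: not an error, just extra work
        load_from_disk(page)
    # ...the actual read/write/jump then proceeds against main memory

for p in ["a", "b", "a", "c", "a"]:
    access(p)
print(page_faults)  # 3 faults: the first touch of a, b and c
```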
Page Replacement Problem
When a page is brought from disk to memory
- Room might have to be made for it by swapping out an unused page
- The unused page to swap out needs to be chosen methodically
OS will have a page replacement policy
- Least Recently Used (LRU) - the page that has been unused for the longest period
- First in First out (FIFO) - the page that has been in memory for the longest period
- Least Frequently Used (LFU) - the page with the fewest references in the recent period
- Second chance - Uses FIFO but keeps pages that were recently used
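A small sketch of the LRU policy above, assuming a fixed number of frames and using Python's `OrderedDict` as the recency list (the frame count and reference string are made up for illustration):

```python
from collections import OrderedDict

def count_faults_lru(references, num_frames):
    """Simulate LRU page replacement and return the number of page faults."""
    frames = OrderedDict()   # keys = resident pages, least recently used first
    faults = 0
    for page in references:
        if page in frames:
            frames.move_to_end(page)        # hit: mark as most recently used
        else:
            faults += 1                     # miss: page fault
            if len(frames) >= num_frames:
                frames.popitem(last=False)  # evict the least recently used page
            frames[page] = True
    return faults

refs = ["z", "a", "b", "a", "a", "a", "z", "b", "b", "a"]
print(count_faults_lru(refs, 2))  # with only 2 frames, some faults are unavoidable
```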
The memory manager might get it wrong
- Can only make a sensible guess based on previous process history and page usage
- Leads to more page faults and disk thrashing as pages are swapped
Principle of Locality
The optimal policy is to swap out the page that won't be needed for the longest time
Can only ever be based on a best guess
Cannot predict the future
But can use the principle of locality as a guide
The Principle of Locality: "Over any short period of time, a process's memory references tend to be spatially localised"
In other words, memory addresses that a process refers to in the next time period are likely to be close to the addresses used in the current period
Cannot always be guaranteed, so page faults are a normal part of how an OS works
Working Set
Following the principle of locality
- Instructions don't usually jump all over the place
- Code within a loop spends time in the same area of memory
- Accessing an array or other data structure will be focused on the same memory area
Generally, execution stays within 20% of a process's pages for 80% of the time
The working set of a process is defined as the set of pages W(T, s) used in the time period from T to T+s
For example, W(12,4) is the working set of pages used by a process in the time period from 12 to 16 (inclusive)
Predicting the Working Set
Memory manager knows which pages have been used in which time periods
| Page | z | a | b | a | a | a | z | b | b | a |
|---|---|---|---|---|---|---|---|---|---|---|
| Time | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
The working set W(3,3) here is from time 3 to time 6: az
By the principle of locality, the working set for the next time period is likely to be similar to the one for the current period
For each process, the memory manager
- Tries to ensure its working set is in memory
- Estimates its next working set by looking at the set used in the preceding period
The predicted working set W(10,2) here would be ab
This is because W(10,2) covers the next 3 units of time (10 to 12), so we look at the working set over the previous 3 units (times 7 to 9), which is ab
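A small sketch that computes W(T, s) from a recorded reference string like the table above, treating both ends of the period as inclusive to match the examples (the function name and reference string are illustrative):

```python
def working_set(references, T, s):
    """Pages used in the (inclusive) time period T to T+s."""
    return set(references[T:T + s + 1])

# Reference string from the table: index = time, value = page used at that time
refs = ["z", "a", "b", "a", "a", "a", "z", "b", "b", "a"]

print(working_set(refs, 3, 3))  # {'a', 'z'}  -- W(3,3), times 3 to 6
print(working_set(refs, 7, 2))  # {'a', 'b'}  -- the previous period used to predict W(10,2)
```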
Working Set Accuracy
Accuracy of the working set depends on its size
- Too small - won't cover the entire current locality for a process
- Too big - will cover several localities
Working set of a process will change as execution moves from one locality to another
- References to different data structures
- Subroutine calls to different parts of the code
Number of page faults will increase during this transition between localities (which is an expected part of how memory management works)
Page Size
Large pages require a smaller page table but increase the amount of wasted memory
- The final page for each process won't be entirely full
- On average, 50% of a page is lost per process due to internal fragmentation
Smaller pages make better use of memory and swap space
- Less fragmentation and wastage
- Bring in only the code and data needed for a working set
But small pages increase the chance of page faults
- Code and data are more likely to be in another page
- More pages to be swapped in and out
Windows and Linux both use 4 KB pages by default
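A quick way to check the page size on a running system, plus the internal fragmentation arithmetic from above (`mmap.PAGESIZE` is standard Python; the process size is a made-up figure for illustration):

```python
import mmap

print(mmap.PAGESIZE)  # typically 4096 (4 KB)

# Internal fragmentation: the last page of a process is rarely full
process_size = 1_234_567                          # bytes (made-up example)
pages_needed = -(-process_size // mmap.PAGESIZE)  # ceiling division
wasted = pages_needed * mmap.PAGESIZE - process_size
print(pages_needed, wasted)                       # on average ~half a page wasted per process
```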
Cache Memory
Storage hierarchy decreases in speed (and cost) as we move away from the CPU
- The fastest parts of the system are inside the CPU itself (registers)
- The slowest parts are the mechanical disk drives (HDDs)
- Main memory is slow compared to registers, but way faster than any disk
Efficiency (speed) can be improved by keeping things as close to the CPU as possible
The cache is a small, fast memory area that sits between the CPU and main memory
- Stored inside the CPU itself
- Contains copies of items in main memory
Caches also exist at multiple other layers of abstraction (hardware device, device controller, device driver, kernel, etc.) as well as the main CPU caches
Cache Hits and Misses
When the CPU requests data from memory, first it checks the cache
If the data is in the cache
- This is a cache hit
- CPU can immediately read the data from the cached copy (very fast)
If the data is not in the cache
- This is a cache miss
- MMU will copy a data block from main memory into the cache (with a time penalty)
- Block will include memory locations around the requested address
- CPU can then read the data from this cached version
By the principle of locality...
- It's likely that data items needed in the near future were already brought into the cache as part of the block
- Chances of cache hits are improved
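A toy simulation of this behaviour, assuming a cache that fetches whole blocks on a miss; touching consecutive addresses (good spatial locality) turns most later accesses into hits (the block size and address range are illustrative):

```python
BLOCK_SIZE = 64          # bytes per cache block (illustrative)
cache = set()            # block numbers currently held in the cache
hits = misses = 0

def read(addr):
    """Check the cache first; on a miss, bring in the whole surrounding block."""
    global hits, misses
    block = addr // BLOCK_SIZE
    if block in cache:
        hits += 1        # cache hit: served from the fast copy
    else:
        misses += 1      # cache miss: fetch the block from main memory
        cache.add(block)

for addr in range(0, 256):    # sequential accesses = good spatial locality
    read(addr)
print(hits, misses)           # 252 hits, 4 misses: one miss per 64-byte block
```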
Cache Levels and Latency
Modern processors are incredibly fast
Memory access is incredibly slow by comparison
CPU can execute hundreds of instructions in the time it takes to fetch a data block from main memory into the cache (latency)
Larger caches have better hit rates but longer latency
Modern CPUs have multiple levels of cache
- Small and fast cache (L1) closest to the CPU registers
- Backed up by bigger, slower caches (L2 and L3)
In multi-core processors
- Each core has its own L1 cache
- Each pair of cores shares a slightly bigger L2 cache
- All cores share a larger L3 cache
Memory Hierarchy
From Expensive/Fast/Small to Cheap/Slow/Large
- CPU
- Registers
- Cache
- L1
- L2
- L3
- System Memory
- RAM
- Solid State Memory
- SSD
- Mechanical Memory
- HDD